Variational MCMC

Authors

  • Nando de Freitas
  • Pedro A. d. F. R. Højen-Sørensen
  • Stuart J. Russell
Abstract

We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a block Metropolis-Hastings (MH) kernel with a variational approximation as proposal distribution. The MH kernel allows one to locate regions of high probability efficiently. The Metropolis kernel allows us to explore the vicinity of these regions. This algorithm outperforms variational approximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates the regions of high probability quickly, thus speeding up convergence. We also present an adaptive MCMC algorithm that iterates between improving the variational approximation and improving the MCMC approximation. We demonstrate the algorithms on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.
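The mixture-kernel idea in the abstract can be sketched in a few lines: with some probability the chain proposes from a fixed approximation q (an independence MH move, standing in for the variational proposal), and otherwise it takes a local random-walk Metropolis step. Everything below is illustrative, not the paper's actual setup: the bimodal 1-D target, the Gaussian stand-in for q, and the mixing probability and step size are assumptions chosen to make the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D target: unnormalized log-density of an equal-weight
# mixture of N(2, 1) and N(-2, 1) (stands in for a posterior).
def log_target(x):
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

# Stand-in "variational approximation" q: a single Gaussian.
q_mean, q_std = 0.0, 2.5

def log_q(x):
    # Normalizing constants cancel in the MH ratio, so they are omitted.
    return -0.5 * ((x - q_mean) / q_std) ** 2

def mixture_kernel_step(x, mix_prob=0.3, rw_scale=0.5):
    if rng.random() < mix_prob:
        # Independence MH move: propose from q, correct with the full ratio
        # pi(y) q(x) / (pi(x) q(y)).
        y = rng.normal(q_mean, q_std)
        log_accept = (log_target(y) + log_q(x)) - (log_target(x) + log_q(y))
    else:
        # Random-walk Metropolis move: symmetric proposal, q terms cancel.
        y = x + rw_scale * rng.normal()
        log_accept = log_target(y) - log_target(x)
    return y if np.log(rng.random()) < log_accept else x

# Run the chain: the independence moves jump between modes, the
# random-walk moves explore locally around each mode.
samples = np.empty(20000)
x = 0.0
for t in range(samples.size):
    x = mixture_kernel_step(x)
    samples[t] = x
```

A plain random-walk chain with the same step size would struggle to cross between the two modes, while sampling from q alone would misrepresent the target's shape; mixing the two kernels preserves the correct stationary distribution (each kernel satisfies detailed balance for the same target) while getting both global moves and local exploration.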


Similar articles

A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI

Two popular classes of methods for approximate inference are Markov chain Monte Carlo (MCMC) and variational inference. MCMC tends to be accurate if run for a long enough time, while variational inference tends to give better approximations at shorter time horizons. However, the amount of time needed for MCMC to exceed the performance of variational methods can be quite high, motivating more fi...


Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies

The Bayesian approach to variable selection in regression is a powerful tool for tackling many scientific problems. Inference for variable selection models is usually implemented using Markov chain Monte Carlo (MCMC). Because MCMC can impose a high computational cost in studies with a large number of variables, we assess an alternative to MCMC based on a simple variational approximation. Our ai...


Variational method for estimating the rate of convergence of Markov-chain Monte Carlo algorithms.

We demonstrate the use of a variational method to determine a quantitative lower bound on the rate of convergence of Markov chain Monte Carlo (MCMC) algorithms as a function of the target density and proposal density. The bound relies on approximating the second largest eigenvalue in the spectrum of the MCMC operator using a variational principle and the approach is applicable to problems with ...


An Introduction to Bayesian Inference Via Variational Approximations

Markov Chain Monte Carlo (MCMC) methods have facilitated an explosion of interest in Bayesian methods. MCMC is an incredibly useful and important tool, but can face difficulties when used to estimate complex posteriors or models applied to large data sets. In this paper I show how a recently developed tool in computer science for fitting Bayesian models, variational approximations, can be used ...


Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo

Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data. These models are almost always fit using variational expectation-maximization, an approximation to true maximum-marginal-likelihood estimation. In this paper, we propose a different approach: rather than use a variational approximation (which produces biased gradient signals), we use Markov chain M...




Journal title:

Volume   Issue

Pages  -

Publication date: 2001